Text-to-image synthesis method based on multi-level structure generative adversarial networks

SUN Yu, LI Linyan, YE Zihan, HU Fuyuan, XI Xuefeng

Journal of Computer Applications 2019, 39 (11): 3204-3209. DOI: 10.11772/j.issn.1001-9081.2019051077

In recent years, Generative Adversarial Networks (GANs) have achieved remarkable success in text-to-image synthesis, but problems remain: blurred image edges, unclear local textures, and small variance among generated samples. To address these shortcomings, a Multi-Level structure Generative Adversarial Networks (MLGAN) model was proposed on the basis of the Stacked Generative Adversarial Network model (StackGAN++); it is composed of multiple generators and discriminators arranged in a hierarchical structure. Firstly, a hierarchical structure coding method and a word vector constraint were introduced to change the condition vector of the generator at each level of the network, so that the edge details and local textures of the images became clearer and more vivid. Then, the generators and discriminators were jointly trained so that the generated image distributions at multiple levels approximated the real image distribution, which increased the variance, and hence the diversity, of the generated samples. Finally, images of the corresponding text at different scales were generated by the generators at different levels. The experimental results show that the Inception scores of the MLGAN model reached 4.22 and 3.88 on the CUB and Oxford-102 datasets respectively, which were 4.45% and 3.74% higher than those of StackGAN++. The MLGAN model reduces edge blurring and unclear local textures in the generated images, and the images it generates are closer to real images.
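The Inception score reported above is conventionally defined as exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is a classifier's predicted class distribution for a generated image and p(y) is the average of those distributions over all generated images. The abstract does not say how the authors computed it, so the following is only a minimal sketch of the standard formula. It assumes the class probabilities are already available as an array; the function name `inception_score` and the `eps` parameter are illustrative, not taken from the paper.

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """Standard Inception score from class probabilities.

    probs: array of shape (N, C); row i is p(y | x_i) for generated image x_i,
           and each row is assumed to sum to 1.
    eps:   small constant (an assumption of this sketch) to avoid log(0).
    """
    probs = np.asarray(probs, dtype=np.float64)
    # Marginal class distribution p(y), averaged over all generated images.
    marginal = probs.mean(axis=0, keepdims=True)
    # Per-image KL divergence KL(p(y|x) || p(y)).
    kl = np.sum(probs * (np.log(probs + eps) - np.log(marginal + eps)), axis=1)
    # Exponential of the mean KL divergence.
    return float(np.exp(kl.mean()))

# Confident predictions spread evenly over 4 classes give the maximum score.
sharp_and_diverse = np.eye(4)
# Identical uniform predictions carry no class information.
uninformative = np.full((4, 4), 0.25)
```

Under this definition the score rises when each image is classified confidently (sharp p(y|x)) and when the images cover many classes (broad p(y)), which is why it is often used as a combined proxy for image quality and diversity.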
